Datacube processing I

Advanced Geospatial Data Processing for Social Scientists

Dennis Abel & Stefan Jünger

2025-04-29

Now

Day Time Title
April 28 10:00-11:15 Introduction
April 28 11:15-11:30 Coffee Break
April 28 11:30-13:00 Raster data in R
April 28 13:00-14:00 Lunch Break
April 28 14:00-15:15 Raster data processing
April 28 15:15-15:30 Coffee Break
April 28 15:30-17:00 Graphical display of raster data in maps
April 29 10:00-11:15 Datacube processing I
April 29 11:15-11:30 Coffee Break
April 29 11:30-13:00 Datacube processing II & API access
April 29 13:00-14:00 Lunch Break
April 29 14:00-15:15 Data integration and linking (with survey data)
April 29 15:15-15:30 Coffee Break
April 29 15:30-17:00 Outlook and open session with own application

Datacubes

Yesterday, we worked with single layers of raster data. Today, we turn towards the more complex raster stacks or datacubes. Essentially we add one or more dimension(s) to our data object. This increases complexity - for humans and machines.

Our goal for today is to:

  • Understand the logic of raster stacks and datacubes
  • Be able to stack multiple raster layers into a coherent raster stack or datacube
  • Be able to navigate and process the spatial and temporal dimensions
  • Familiarize ourselves with the stars and terra packages for raster processing

Datacubes

Source

Most important EOD packages 📦

If you want to do anything in R, you need to use functions, and functions are provided through packages.

terra (and its predecessor raster) by Robert Hijmans is probably the most commonly used package for raster data in R (and equally relevant for vector data).

Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data.

The authors have produced a very extensive tutorial which is basically a textbook for spatial data science in R.

Most important EOD packages 📦

If you want to do anything in R, you need to use functions, and functions are provided through packages.

stars by Edzer Pebesma is equally useful when handling spatiotemporal arrays (datacubes).

This R package provides classes and methods for reading, manipulating, plotting and writing such data cubes, to the extent that there are proper formats for doing so.

The stars syntax follows the logic of the sf-package (also by Edzer Pebesma) and together they can be seen as the emergence of a “spatial tidyverse”.

Edzer Pebesma’s and Roger Bivand’s Spatial Data Science textbook is THE go-to resource for learning the sf and stars syntax.

Similarities and differences between terra and stars

  • With terra, we create three-dimensional stacks of raster layers, were the third dimension is often (not always) time - in contrast, stars objects can be multi-dimensional and can store more dimensions (e.g. several variables)
  • Major difference: terra doesn’t treat the third dimension (time or band) as a “dimension” in the same way stars does.
  • Time is encoded via layer names or separate metadata in terra objects — you manage it manually.

Informally we will call SpatRaster objects created with terra “raster stacks” and stars objects “raster cubes”.

Stacking raster layers

We will start by manually creating our raster stacks/cubes by combining several layers. We will first show our terra-approach and afterwards replicate the process with stars. These steps include:

  • Combination of separate layers into one object
  • Adjustments to metadata and global attributes like:
    • Time dimension
    • CRS

The output will be a clean raster stack/cube for further processing/analysis.

terra approach: stacking raster layers

We load four layers of population data for California for the years 2017-2020.

CA_pop_2017 <- terra::rast("./data/US-CA_ppp_2017_1km.tif")
CA_pop_2018 <- terra::rast("./data/US-CA_ppp_2018_1km.tif")
CA_pop_2019 <- terra::rast("./data/US-CA_ppp_2019_1km.tif")
CA_pop_2020 <- terra::rast("./data/US-CA_ppp_2020_1km.tif")

class(CA_pop_2017)

CA_pop_2017
[1] "SpatRaster"
attr(,"package")
[1] "terra"
class       : SpatRaster 
dimensions  : 1136, 1234, 1  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
source      : US-CA_ppp_2017_1km.tif 
name        : US-CA_ppp_2017_1km 

terra approach: stacking raster layers

We can simply concatenate (c()) the layers to create a raster stack.

CA_pop_stack <- c(CA_pop_2017, CA_pop_2018, CA_pop_2019, CA_pop_2020)

class(CA_pop_stack)
[1] "SpatRaster"
attr(,"package")
[1] "terra"
CA_pop_stack
class       : SpatRaster 
dimensions  : 1136, 1234, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : US-CA_ppp_2017_1km.tif  
              US-CA_ppp_2018_1km.tif  
              US-CA_ppp_2019_1km.tif  
              US-CA_ppp_2020_1km.tif  
names       : US-CA_p~017_1km, US-CA_p~018_1km, US-CA_p~019_1km, US-CA_p~020_1km 

terra approach: stacking raster layers

We can integrate reading and stacking into one step by providing a list of raster layers to the terra::rast() command.

# Stacking more automated
files <- list.files("./data", pattern = "US-CA_ppp_20(17|18|19|20)_1km\\.tif$", full.names = TRUE)
CA_pop_stack <- terra::rast(files)

CA_pop_stack
class       : SpatRaster 
dimensions  : 1136, 1234, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : US-CA_ppp_2017_1km.tif  
              US-CA_ppp_2018_1km.tif  
              US-CA_ppp_2019_1km.tif  
              US-CA_ppp_2020_1km.tif  
names       : US-CA_p~017_1km, US-CA_p~018_1km, US-CA_p~019_1km, US-CA_p~020_1km 

terra approach: stacking raster layers

You know that we can utilize the base R plot function to visualize SpatRaster objects. This is also true for raster stacks.

# Have a first glance at the data
terra::plot(CA_pop_stack)

terra approach: stacking raster layers

Dates can be assigned to the layers. This will effectively create a new attribute in the object IN ADDITION to the name of the variable.

layer_dates <- as.Date(paste0(2017:2020, "-01-01"))
time(CA_pop_stack) <- layer_dates

CA_pop_stack
class       : SpatRaster 
dimensions  : 1136, 1234, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : US-CA_ppp_2017_1km.tif  
              US-CA_ppp_2018_1km.tif  
              US-CA_ppp_2019_1km.tif  
              US-CA_ppp_2020_1km.tif  
names       : US-CA_p~017_1km, US-CA_p~018_1km, US-CA_p~019_1km, US-CA_p~020_1km 
time (days) : 2017-01-01 to 2020-01-01 
time(CA_pop_stack)
[1] "2017-01-01" "2018-01-01" "2019-01-01" "2020-01-01"
timeInfo(CA_pop_stack)
  time step zone
1 TRUE days     

terra approach: stacking raster layers

When plotting the raster stack, by default the machine understands to access the time attribute, not the names, to label the plots.

# Have a first glance at the data
terra::plot(CA_pop_stack)

terra approach: stacking raster layers

When defining the time dimension, we can also store the time zone of the date.

layer_posix <- as.POSIXct(layer_dates, tz = "UTC")

time(CA_pop_stack) <- layer_posix

time(CA_pop_stack)
[1] "2017-01-01 UTC" "2018-01-01 UTC" "2019-01-01 UTC" "2020-01-01 UTC"
timeInfo(CA_pop_stack)
  time    step zone
1 TRUE seconds  UTC

terra approach: stacking raster layers

When working with a yearly resolution, it is often consensus to set the date to the first of January. If you want to avoid confusion and explicitly assign only the year, you can do that by adjusting the tstep= argument in the terra::time-function.

layer_years <- c(2017:2020)
time(CA_pop_stack, tstep = "years") <- layer_years

CA_pop_stack
class       : SpatRaster 
dimensions  : 1136, 1234, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : US-CA_ppp_2017_1km.tif  
              US-CA_ppp_2018_1km.tif  
              US-CA_ppp_2019_1km.tif  
              US-CA_ppp_2020_1km.tif  
names       : US-CA_p~017_1km, US-CA_p~018_1km, US-CA_p~019_1km, US-CA_p~020_1km 
time (years): 2017 to 2020 
time(CA_pop_stack)
[1] 2017 2018 2019 2020
timeInfo(CA_pop_stack)
  time  step zone
1 TRUE years     

terra approach: stacking raster layers

When working with a yearly resolution, it is often consensus to set the date to the first of January. If you want to avoid confusion and explicitly assign only the year, you can do that by adjusting the tstep= argument in the terra::time-function.

# Have a first glance at the data
terra::plot(CA_pop_stack)

terra approach: stacking raster layers

By default, the variable name is derived from the file-name. We can replace it with a simpler version.

# Adjust variable name
names(CA_pop_stack) <- paste0("pop_", layer_years)

print(CA_pop_stack)
class       : SpatRaster 
dimensions  : 1136, 1234, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : US-CA_ppp_2017_1km.tif  
              US-CA_ppp_2018_1km.tif  
              US-CA_ppp_2019_1km.tif  
              US-CA_ppp_2020_1km.tif  
names       : pop_2017, pop_2018, pop_2019, pop_2020 
time (years): 2017 to 2020 

terra approach: stacking raster layers

There are also options to adjust short and long variable names.

# Optional - setting variable names
varnames(CA_pop_stack) <- "population"

longnames(CA_pop_stack) <- "California population estimate (1 km grid)"

print(CA_pop_stack)
class       : SpatRaster 
dimensions  : 1136, 1234, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : US-CA_ppp_2017_1km.tif  
              US-CA_ppp_2018_1km.tif  
              US-CA_ppp_2019_1km.tif  
              US-CA_ppp_2020_1km.tif  
varnames    : population (California population estimate (1 km grid)) 
              population (California population estimate (1 km grid)) 
              population (California population estimate (1 km grid)) 
              ...
names       : pop_2017, pop_2018, pop_2019, pop_2020 
time (years): 2017 to 2020 

terra approach: stacking raster layers

Although in practice, we often don’t do it, there is also the option to explicitly set the value unit.

# Setting units
units(CA_pop_stack)
[1] "" "" "" ""
units(CA_pop_stack) <- "count"

units(CA_pop_stack)
[1] "count" "count" "count" "count"

terra approach: stacking raster layers

For a quick glance at the distribution of the data, it is helpful to add minimum and maximum values to the metadata. By default not displayed because first reading raster from disk just creates a pointer object which does not read all cell values.

setMinMax(CA_pop_stack, force=TRUE)

print(CA_pop_stack)
class       : SpatRaster 
dimensions  : 1136, 1234, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -124.4179, -114.1346, 32.54125, 42.00792  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : US-CA_ppp_2017_1km.tif  
              US-CA_ppp_2018_1km.tif  
              US-CA_ppp_2019_1km.tif  
              US-CA_ppp_2020_1km.tif  
varnames    : population (California population estimate (1 km grid)) 
              population (California population estimate (1 km grid)) 
              population (California population estimate (1 km grid)) 
              ...
names       : pop_2017, pop_2018, pop_2019, pop_2020 
min values  :     0.00,        0,     0.00,     0.00 
max values  : 29224.54,    29316, 29433.52, 29535.46 
unit        :    count,    count,    count,    count 
time (years): 2017 to 2020 

terra approach: stacking raster layers

With the crs-function, we can directly access the CRS of the object. You can assign a label to the crs of the stack (for example in case it does not have one). IMPORTANT: Keep in mind that this DOES NOT reproject the data.

crs(CA_pop_stack)
[1] "GEOGCRS[\"WGS 84\",\n    ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n        MEMBER[\"World Geodetic System 1984 (Transit)\"],\n        MEMBER[\"World Geodetic System 1984 (G730)\"],\n        MEMBER[\"World Geodetic System 1984 (G873)\"],\n        MEMBER[\"World Geodetic System 1984 (G1150)\"],\n        MEMBER[\"World Geodetic System 1984 (G1674)\"],\n        MEMBER[\"World Geodetic System 1984 (G1762)\"],\n        MEMBER[\"World Geodetic System 1984 (G2139)\"],\n        MEMBER[\"World Geodetic System 1984 (G2296)\"],\n        ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n            LENGTHUNIT[\"metre\",1]],\n        ENSEMBLEACCURACY[2.0]],\n    PRIMEM[\"Greenwich\",0,\n        ANGLEUNIT[\"degree\",0.0174532925199433]],\n    CS[ellipsoidal,2],\n        AXIS[\"geodetic latitude (Lat)\",north,\n            ORDER[1],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        AXIS[\"geodetic longitude (Lon)\",east,\n            ORDER[2],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n    USAGE[\n        SCOPE[\"Horizontal component of 3D system.\"],\n        AREA[\"World.\"],\n        BBOX[-90,-180,90,180]],\n    ID[\"EPSG\",4326]]"
# crs(CA_pop_stack) <- "EPSG:4326"

terra approach: stacking raster layers

If you want to reproject the data, you can utilize project as before. Here, we transform to NAD83 / California Albers (EPSG:3310).

CA_pop_stack <- terra::project(CA_pop_stack, 
                               y = "EPSG:3310",
                               method = "bilinear"
                               )

print(CA_pop_stack)
class       : SpatRaster 
dimensions  : 1287, 1168, 4  (nrow, ncol, nlyr)
resolution  : 828.2603, 828.2603  (x, y)
extent      : -415531, 551877.1, -607609.6, 458361.4  (xmin, xmax, ymin, ymax)
coord. ref. : NAD83 / California Albers (EPSG:3310) 
source(s)   : memory
names       : pop_2017, pop_2018, pop_2019, pop_2020 
min values  :      0.0,     0.00,     0.00,     0.00 
max values  :  22052.6, 22123.96, 22208.66, 22285.26 
unit        :    count,    count,    count,    count 
time (years): 2017 to 2020 

terra approach: stacking raster layers

If you want to reproject the data, you can utilize project as before. Here, we transform to NAD83 / California Albers (EPSG:3310).

# Have a first glance at the data
terra::plot(CA_pop_stack)

Note on reprojecting raster data

Transforming (projecting) raster data is fundamentally different from transforming vector data. Vector data can be transformed and back-transformed without loss in precision and without changes in the values. This is not the case with raster data. In each transformation the values for the new cells are estimated in some fashion. Therefore, if you need to match raster and vector data for analysis, you should generally transform the vector data.

terra approach: stacking raster layers

Now that we have properly prepared variable names and times, we have several options to access single layers.

# Indexing by layer index number
CA_pop_stack[[1]]
class       : SpatRaster 
dimensions  : 1287, 1168, 1  (nrow, ncol, nlyr)
resolution  : 828.2603, 828.2603  (x, y)
extent      : -415531, 551877.1, -607609.6, 458361.4  (xmin, xmax, ymin, ymax)
coord. ref. : NAD83 / California Albers (EPSG:3310) 
source(s)   : memory
name        : pop_2017 
min value   :      0.0 
max value   :  22052.6 
unit        :    count 
time (years): 2017 
# Indexing by layer name
CA_pop_stack["pop_2018"]
class       : SpatRaster 
dimensions  : 1287, 1168, 1  (nrow, ncol, nlyr)
resolution  : 828.2603, 828.2603  (x, y)
extent      : -415531, 551877.1, -607609.6, 458361.4  (xmin, xmax, ymin, ymax)
coord. ref. : NAD83 / California Albers (EPSG:3310) 
source(s)   : memory
name        : pop_2018 
min value   :     0.00 
max value   : 22123.96 
unit        :    count 
time (years): 2018 
# Indexing by time point
CA_pop_stack["2019"]
class       : SpatRaster 
dimensions  : 1287, 1168, 1  (nrow, ncol, nlyr)
resolution  : 828.2603, 828.2603  (x, y)
extent      : -415531, 551877.1, -607609.6, 458361.4  (xmin, xmax, ymin, ymax)
coord. ref. : NAD83 / California Albers (EPSG:3310) 
source(s)   : memory
name        : pop_2019 
min value   :     0.00 
max value   : 22208.66 
unit        :    count 
time (years): 2019 

terra approach: stacking raster layers

Now that we have properly prepared variable names and times, we have several options to access single layers.

# Plot a single layer
terra::plot(CA_pop_stack[[1]])

terra approach: stacking raster layers

Exporting your object is the same for raster stacks.

# Export
writeRaster(
  CA_pop_stack,
  "./data/CA_pop_stack.tif",
  overwrite=TRUE,
  gdal=c("COMPRESS=LZW","BIGTIFF=YES")
)

Exercise 5_1: A raster stack with terra

Exercise

stars approach: stacking raster layers

We will now repeat this process with stars. We load four layers of population data for California for the years 2017-2020 with read_stars.

CA_pop_2017 <- read_stars("./data/US-CA_ppp_2017_1km.tif")
CA_pop_2018 <- read_stars("./data/US-CA_ppp_2018_1km.tif")
CA_pop_2019 <- read_stars("./data/US-CA_ppp_2019_1km.tif")
CA_pop_2020 <- read_stars("./data/US-CA_ppp_2020_1km.tif")

class(CA_pop_2017)

print(CA_pop_2017)
[1] "stars"
stars object with 2 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
                        Min. 1st Qu. Median     Mean    3rd Qu.     Max.  NA's
US-CA_ppp_2017_1km.tif     0       0      0 1.884433 0.09498084 1906.748 60820
dimension(s):
  from   to offset     delta refsys point x/y
x    1 1234 -124.4  0.008333 WGS 84 FALSE [x]
y    1 1136  42.01 -0.008333 WGS 84 FALSE [y]

stars approach: stacking raster layers

Concatenating (c()) stars layers also works.

CA_pop_cube <- c(CA_pop_2017, CA_pop_2018, CA_pop_2019, CA_pop_2020)

class(CA_pop_cube)
[1] "stars"
print(CA_pop_cube)
stars object with 2 dimensions and 4 attributes
attribute(s), summary of first 1e+05 cells:
                        Min. 1st Qu. Median     Mean    3rd Qu.     Max.  NA's
US-CA_ppp_2017_1km.tif     0       0      0 1.884433 0.09498084 1906.748 60820
US-CA_ppp_2018_1km.tif     0       0      0 1.889280 0.09474182 1919.257 60820
US-CA_ppp_2019_1km.tif     0       0      0 1.894074 0.09462410 1933.029 60820
US-CA_ppp_2020_1km.tif     0       0      0 1.898908 0.09593478 1953.220 60820
dimension(s):
  from   to offset     delta refsys point x/y
x    1 1234 -124.4  0.008333 WGS 84 FALSE [x]
y    1 1136  42.01 -0.008333 WGS 84 FALSE [y]

stars approach: stacking raster layers

Again, integrating these steps is more efficient.

BUT WAIT! This generates four separate attributes. This is not what we want!

files <- list.files("./data",
                    pattern = "US-CA_ppp_20(17|18|19|20)_1km\\.tif$",
                    full.names = TRUE)
CA_pop_cube <- read_stars(files)

print(CA_pop_cube)
stars object with 2 dimensions and 4 attributes
attribute(s), summary of first 1e+05 cells:
    Min. 1st Qu. Median     Mean    3rd Qu.     Max.  NA's
17     0       0      0 1.884433 0.09498084 1906.748 60820
18     0       0      0 1.889280 0.09474182 1919.257 60820
19     0       0      0 1.894074 0.09462410 1933.029 60820
20     0       0      0 1.898908 0.09593478 1953.220 60820
dimension(s):
  from   to offset     delta refsys point x/y
x    1 1234 -124.4  0.008333 WGS 84 FALSE [x]
y    1 1136  42.01 -0.008333 WGS 84 FALSE [y]

stars approach: stacking raster layers

We want to integrate the values into one attribute since it is the same variable.

In order to do that, we need to create the time-dimension first.

dates <- as.Date(paste0(2017:2020, "-01-01"))

stars approach: stacking raster layers

We can apply the merge-function on our existing datacube to integrate the four attributes into one by supplying the dates:

cube_merged <- merge(CA_pop_cube, f = dates, name = "time")

print(cube_merged)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
             Min. 1st Qu. Median     Mean    3rd Qu.     Max.  NA's
17.18.19.20     0       0      0 1.884433 0.09498084 1906.748 60820
dimension(s):
  from   to     offset     delta refsys point x/y
x    1 1234     -124.4  0.008333 WGS 84 FALSE [x]
y    1 1136      42.01 -0.008333 WGS 84 FALSE [y]
f    1    4 2017-01-01  365 days   Date    NA    

stars approach: stacking raster layers

A more straightforward approach is to read the layers directly into one object based on the supplied date-variable:

files <- list.files("./data",
                    pattern = "US-CA_ppp_20(17|18|19|20)_1km\\.tif$",
                    full.names = TRUE)

# Read layers along time dimension
CA_pop_cube <- read_stars(files, along = list(time = dates))

print(CA_pop_cube)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
                        Min. 1st Qu. Median     Mean    3rd Qu.     Max.  NA's
US-CA_ppp_2017_1km.tif     0       0      0 1.884433 0.09498084 1906.748 60820
dimension(s):
     from   to     offset     delta refsys point x/y
x       1 1234     -124.4  0.008333 WGS 84 FALSE [x]
y       1 1136      42.01 -0.008333 WGS 84 FALSE [y]
time    1    4 2017-01-01  365 days   Date    NA    

stars approach: stacking raster layers

The st_dimensions()and st_set_dimensions() functions are super helpful to access and manipulate dimension information. For example, if we want to transform the date into POSIX and assign a timezone, we can do it like this:

layer_posix <- as.POSIXct(dates, tz = "UTC")
CA_pop_cube <- st_set_dimensions(CA_pop_cube,
                                 which = "time",
                                 values = layer_posix)

st_dimensions(CA_pop_cube)$time
$from
[1] 1

$to
[1] 4

$offset
[1] "2017-01-01 UTC"

$delta
Time difference of 365 days

$refsys
[1] "POSIXct"

$point
[1] NA

$values
NULL

attr(,"class")
[1] "dimension"

stars approach: stacking raster layers

We can also simplify the dimension by assigning years as numeric values.

CA_pop_cube <- st_set_dimensions(CA_pop_cube,
                          which = "time",
                          values = 2017:2020)

st_dimensions(CA_pop_cube)$time
$from
[1] 1

$to
[1] 4

$offset
[1] 2017

$delta
[1] 1

$refsys
[1] NA

$point
[1] NA

$values
NULL

attr(,"class")
[1] "dimension"

stars approach: stacking raster layers

In stars lingo, our variable is called “attribute”. We can relabel this name as well.

names(CA_pop_cube)
[1] "US-CA_ppp_2017_1km.tif"
names(CA_pop_cube) <- "population"

print(CA_pop_cube)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
            Min. 1st Qu. Median     Mean    3rd Qu.     Max.  NA's
population     0       0      0 1.884433 0.09498084 1906.748 60820
dimension(s):
     from   to offset     delta refsys point x/y
x       1 1234 -124.4  0.008333 WGS 84 FALSE [x]
y       1 1136  42.01 -0.008333 WGS 84 FALSE [y]
time    1    4   2017         1     NA    NA    

stars approach: stacking raster layers

Setting units in stars is not that straightforward and requires a “manual” and quite complicated workaround. Normally, we wouldn’t recommend that.

pop_array <- CA_pop_cube[["population"]]

valid_units <- valid_udunits()

pop_array <- set_units(pop_array, "count")

CA_pop_cube[["population"]] <- pop_array

print(CA_pop_cube)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
                   Min. 1st Qu. Median     Mean    3rd Qu.     Max.  NA's
population [count]    0       0      0 1.884433 0.09498084 1906.748 60820
dimension(s):
     from   to offset     delta refsys point x/y
x       1 1234 -124.4  0.008333 WGS 84 FALSE [x]
y       1 1136  42.01 -0.008333 WGS 84 FALSE [y]
time    1    4   2017         1     NA    NA    

stars approach: stacking raster layers

Access to the CRS is easy with st_crs. The EPSG code is not supplied as string (“EPSG:XXXX”) but as numeric values.

# CRS
st_crs(CA_pop_cube)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    ENSEMBLE["World Geodetic System 1984 ensemble",
        MEMBER["World Geodetic System 1984 (Transit)"],
        MEMBER["World Geodetic System 1984 (G730)"],
        MEMBER["World Geodetic System 1984 (G873)"],
        MEMBER["World Geodetic System 1984 (G1150)"],
        MEMBER["World Geodetic System 1984 (G1674)"],
        MEMBER["World Geodetic System 1984 (G1762)"],
        MEMBER["World Geodetic System 1984 (G2139)"],
        MEMBER["World Geodetic System 1984 (G2296)"],
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]],
        ENSEMBLEACCURACY[2.0]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    USAGE[
        SCOPE["Horizontal component of 3D system."],
        AREA["World."],
        BBOX[-90,-180,90,180]],
    ID["EPSG",4326]]
# to label is missing - not to reproject
# st_crs(CA_pop_cube) <- 4326

stars approach: stacking raster layers

We can reproject the data with st_warp(). Default of use_gdal argument is TRUE. Unfortunately, this will remove our time-dimension metadata assigned earlier. Set it to FALSE to preserve that. This has the downside that the native stars warp is limited to nearest-neighbor resampling and slower. Alternative is to reassign time-metadata afterwards.

CA_pop_cube <- st_warp(CA_pop_cube, 
                       crs = 3310,
                       cellsize = NA_real_,
                       use_gdal = FALSE # to preserve time-dimension metadata
                       )

print(CA_pop_cube)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
                   Min. 1st Qu. Median     Mean   3rd Qu.     Max.  NA's
population [count]    0       0      0 1.894517 0.1030732 1906.748 71328
dimension(s):
     from   to  offset  delta                    refsys x/y
x       1 1171 -415531  826.2 NAD83 / California Albers [x]
y       1 1279  458361 -826.2 NAD83 / California Albers [y]
time    1    4    2017      1                        NA    

stars approach: stacking raster layers

We index time layers by index number.

CA_pop_cube[,,,1]
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
                   Min. 1st Qu. Median     Mean   3rd Qu.     Max.  NA's
population [count]    0       0      0 1.894517 0.1030732 1906.748 71328
dimension(s):
     from   to  offset  delta                    refsys x/y
x       1 1171 -415531  826.2 NAD83 / California Albers [x]
y       1 1279  458361 -826.2 NAD83 / California Albers [y]
time    1    1    2017      1                        NA    

stars approach: stacking raster layers

There is also a native export function for stars objects.

write_stars(CA_pop_cube,
            "./data/CA_pop_cube.tif",
            driver = "GTiff",
            options = c("COMPRESS=LZW", "BIGTIFF=YES")
            )

Converting between terra and stars

NOCHMAL CHECKEN - HIER IST WAS FALSCH In theory, converting between terra and stars objects is simple. This will preserve most crucial information on cell values, spatial extent, and CRS. It can, however, mess around with your neatly prepared metadata.

# Tranform terra SpatRaster into stars object
# CA_pop_stack_stars <- stars::st_as_stars(CA_pop_stack)
# 
# class(CA_pop_stack_stars)
# 
# print(CA_pop_stack_stars)

Converting between terra and stars

In theory, converting between terra and stars objects is simple. This will preserve most crucial information on cell values, spatial extent, and CRS. It can, however, mess around with your neatly prepared metadata.

# Tranform stars object into terra SpatRaster
# CA_pop_cube_terra <- terra::rast(CA_pop_cube)
# 
# class(CA_pop_cube_terra)
# 
# print(CA_pop_cube_terra)

Exercise 5_2: A raster cube with stars

Exercise